Design and Validation of Proteome Measurements
Author
Abstract
Proteomics is a branch of biology that aims to comprehensively characterize a proteome. Mass spectrometry-based proteomics has proven to be the most powerful approach to achieving this goal. This thesis introduces statistical concepts for optimally designing and validating shotgun proteomics experiments, thereby making it possible to achieve reliable and extensive proteome coverage efficiently. The first part reports methods to estimate false discovery rates for peptide and protein identifications. These approaches made it possible to reliably and comprehensively identify unusually modified protein variants. It turned out that these variants contribute a significant fraction of the spectral evidence. This work presents a generalized target-decoy approach to estimate false discovery rates for protein identifications. It shows evidence that the reliability of protein identifications in large studies has so far been largely overestimated and provides guidelines for compiling identifications at well-defined confidence. This part concludes by formulating a generic framework to compare protein inference engines based on protein identification false discovery rates. A systematic comparison of thousands of protein inference variants revealed that simple approaches yield optimal inference performance. The second part develops a nonparametric Bayesian approach to optimally design shotgun proteomics studies. To this end, the proteome coverage prediction task is introduced. An extended infinite Markov model is presented to perform proteome coverage prediction for simple shotgun proteomics experiments. To capture the intricate similarities among peptide distributions arising in integrated shotgun proteomics studies, this work develops the general concept of the fractal Dirichlet process, which augments the hierarchical Dirichlet process by introducing self-referential base measures. The fractal process is successfully applied to predict proteome coverage for integrated shotgun proteomics datasets. Rational stop criteria for these studies are discussed and evaluated by means of the proteome coverage prediction approaches. Finally the proteome coverage
Similar resources
Optimal Operation of a Three-Product Dividing-Wall Column with Self-Optimizing Control Structure Design
This paper deals with the optimal operation of a three-product Dividing-Wall Column (DWC). The main idea is to design a control structure, through a systematic procedure for plantwide control, with the objective of achieving the desired product purities with the minimum use of energy. The exact local method is used to find the best controlled variables as a single measurement or a combination of measurements ...
LDA Experimental Data of Three-Poster Jet Impingement System
During its near-ground hovering phase a Short Take-Off and Vertical Landing (STOVL) aircraft creates a complex three-dimensional flow field between jet streams, the airframe surface and the ground. A proper understanding and numerical prediction of this flow is important in the design of such aircraft. In this paper an experimental facility, used to gather validation data suitable for testing C...
Evaluation of Leaf Proteome in Wheat Genotypes Under Drought Stress
Drought stress in plants changes (increases or decreases) the production of plant proteins. In recent years, proteomics has become one of the most powerful tools for studying changes in proteins. In order to investigate the proteome of wheat leaves in response to terminal drought, two wheat genotypes, one susceptible and one resistant, were evaluated under irrigated (non-stress) and rain-fe...
Proteome Analysis of Rat Hippocampus Following Morphine-induced Amnesia and State-dependent Learning
Morphine’s effects on learning and memory processes are well known to depend on synaptic plasticity in the hippocampus. Whereas the role of the hippocampus in morphine-induced amnesia and state-dependent learning is established, the biochemical and molecular mechanisms underlying these processes are poorly understood. The present study intended to investigate whether administration of morphine ...
I-49: Human Y Chromosome Proteome Project
The success of the Human Genome Project (HGP) has provided a blueprint for the approximately 20,000 gene-encoded proteins potentially active in all of the hundreds of cell types that make up the human body. Yet we still have limited knowledge about a majority of the gene-encoded proteins which are the “building blocks of life” and “cellular machinery”. It is estimated that for nearly half of th...